185 research outputs found

MEGASAT: automated inference of microsatellite genotypes from sequence data

Author: Beiko Robert G
Bentzen Paul
Bradbury Ian R
Fraser Bonnie A
Paterson Ian G
Ravindran Praveen Nadukkalam
Reznick David
Watson Beth
Zhan Luyao
Publication venue: 'Wiley'
Publication date: 19/07/2016

MEGASAT is software that enables genotyping of microsatellite loci using next-generation sequencing data. Microsatellites are amplified in large multiplexes, and then sequenced in pooled amplicons. MEGASAT reads sequence files and automatically scores microsatellite genotypes. It uses fuzzy matches to allow for sequencing errors and applies decision rules to account for amplification artefacts, including nontarget amplification products, replication slippage during PCR (amplification stutter) and differential amplification of alleles. An important feature of MEGASAT is the generation of histograms of the length–frequency distributions of amplification products for each locus and each individual. These histograms, analogous to electropherograms traditionally used to score microsatellite genotypes, enable rapid evaluation and editing of automatically scored genotypes. MEGASAT is written in Perl, runs on Windows, Mac OS X and Linux systems, and includes a simple graphical user interface. We demonstrate MEGASAT using data from guppy, Poecilia reticulata. We genotype 1024 guppies at 43 microsatellites per run on an Illumina MiSeq sequencer. We evaluated the accuracy of automatically called genotypes using two methods, based on pedigree and repeat genotyping data, and obtained estimates of mean genotyping error rates of 0.021 and 0.012. In both estimates, three loci accounted for a disproportionate fraction of genotyping errors; conversely, 26 loci were scored with 0–1 detected error (error rate ≤0.007). Our results show that with appropriate selection of loci, automated genotyping of microsatellite loci can be achieved with very high throughput, low genotyping error and very low genotyping costs.
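
As a rough illustration of the length–frequency histograms described in this abstract, the Python sketch below counts reads per amplicon length for one locus in one individual and applies a toy calling rule. It is not MEGASAT's code (MEGASAT is written in Perl) and does not reproduce its decision rules. The function names, the `min_reads` and `het_ratio` thresholds, and the rule itself are assumptions made for illustration only.

```python
from collections import Counter


def length_histogram(read_lengths):
    """Count how many reads were observed at each amplicon length."""
    return Counter(read_lengths)


def call_genotype(hist, min_reads=10, het_ratio=0.3):
    """Toy genotype call from a length histogram (hypothetical rule).

    The most frequent length is taken as one allele. The second most
    frequent length is called as a second allele only if its read count
    is at least het_ratio times the count of the top length; otherwise
    the individual is scored as homozygous. Returns None when coverage
    is below min_reads.
    """
    if sum(hist.values()) < min_reads:
        return None  # too few reads to score this locus
    # Sort by descending read count, breaking ties by shorter length.
    ranked = sorted(hist.items(), key=lambda kv: (-kv[1], kv[0]))
    top_len, top_count = ranked[0]
    if len(ranked) > 1 and ranked[1][1] >= het_ratio * top_count:
        return tuple(sorted((top_len, ranked[1][0])))
    return (top_len, top_len)
```

A real caller would also need to separate stutter products from true alleles and to handle unequal amplification of alleles, both of which the abstract says MEGASAT addresses with its own rules; this sketch ignores those effects.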

Crossref

Sussex Research Online

Parsimonious Inference of Hybridization in the Presence of Incomplete Lineage Sorting

Author: Arnold
Bansal
Barton
Beiko
Bloomquist
Bordewich
Cranston
Degnan
Edwards
Felsenstein
Hobolth
Hudson
Huelsenbeck
Joly
Kingman
Kubatko
Kuo
Liu
Luay Nakhleh
MacLeod
Maddison
Maddison
Mallet
Mallet
Meng
Nakhleh
Nakhleh
Nakhleh
Nakhleh
Pollard
R. Matthew Barnett
Rambaut
Rannala
Rasmussen
Rieseberg
Rokas
Swofford
Syring
Takuno
Than
Than
Than
Than
Than
Than
White
Yu
Yu
Yu
Yu
Yu
Yun Yu
Publication date: 01/01/2013

Hybridization plays an important evolutionary role in several groups of organisms. A phylogenetic approach to detect hybridization entails sequencing multiple loci across the genomes of a group of species of interest, reconstructing their gene trees, and taking their differences as indicators of hybridization. However, methods that follow this approach mostly ignore population effects, such as incomplete lineage sorting (ILS). Given that hybridization occurs between closely related organisms, ILS may very well be at play and, hence, must be accounted for in the analysis framework. To address this issue, we present a parsimony criterion for reconciling gene trees within the branches of a phylogenetic network, and a local search heuristic for inferring phylogenetic networks from collections of gene-tree topologies under this criterion. This framework enables phylogenetic analyses while accounting for both hybridization and ILS. Further, we propose two techniques for incorporating information about uncertainty in gene-tree estimates. Our simulation studies demonstrate the good performance of our framework in terms of identifying the location of hybridization events, as well as estimating the proportions of genes that underwent hybridization. Also, our framework shows good performance in terms of efficiency on handling large data sets in our experiments. Further, in analyzing a yeast data set, we demonstrate issues that arise when analyzing real data sets. While a probabilistic approach was recently introduced for this problem, and while parsimonious reconciliations have accuracy issues under certain settings, our parsimony framework provides a much more computationally efficient technique for this type of analysis. Our framework now allows for genome-wide scans for hybridization, while also accounting for ILS

Crossref

PubMed Central

DSpace at Rice University

A new, fast algorithm for detecting protein coevolution using maximum compatible cliques

Author: A Rodionov
A Valencia
AK Ramani
Alex Rodionov
Alexandr Bezginov
AM Altenhoff
D MacLeod
D Robinson
Elisabeth RM Tillier
ERM Tillier
ERM Tillier
F Pazos
F Pazos
GW Clark
J Felsenstein
J Felsenstein
Jonathan Rose
K Katoh
MK Kuhner
PRJ Östergård
R Jothi
RG Beiko
RM Karp
S Razick
T Sato
V Soria-Carrasco
W Li
Publication venue: BioMed Central
Publication date: 01/01/2011

Background: The MatrixMatchMaker (MMM) algorithm was recently introduced to detect the similarity between phylogenetic trees and thus the coevolution between proteins. MMM finds the largest common submatrices between pairs of phylogenetic distance matrices, and has numerous advantages over existing methods of coevolution detection. However, these advantages came at the cost of a very long execution time. Results: In this paper, we show that the problem of finding the maximum submatrix reduces to a multiple maximum clique subproblem on a graph of protein pairs. This allowed us to develop a new algorithm and program implementation, MMMvII, which achieved more than 600× speedup with comparable accuracy to the original MMM. Conclusions: MMMvII will thus allow for more extensive and intricate analyses of coevolution. Availability: An implementation of the MMMvII algorithm is available at http://www.uhnresearch.ca/labs/tillier/MMMWEBvII/MMMWEBvII.php
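
To make the reduction mentioned in this abstract more concrete, here is a minimal Python sketch of one textbook way to cast a "largest common submatrix" search as a maximum clique problem. It is not the MMMvII implementation. The tolerance test, the function names and the use of plain Bron–Kerbosch with pivoting are all assumptions chosen for illustration. Each node is a pair (i, j) matching row i of distance matrix A to row j of distance matrix B, and two nodes are joined when the corresponding distances agree within `tol`.

```python
from itertools import product


def compatibility_graph(A, B, tol=0.1):
    """Build a graph whose nodes are index pairs (i, j).

    Nodes (i, j) and (k, l) are adjacent when they use different rows
    of each matrix and A[i][k] agrees with B[j][l] within tol.
    A and B are assumed to be symmetric distance matrices.
    """
    nodes = list(product(range(len(A)), range(len(B))))
    adj = {n: set() for n in nodes}
    for (i, j), (k, l) in product(nodes, nodes):
        if i != k and j != l and abs(A[i][k] - B[j][l]) <= tol:
            adj[(i, j)].add((k, l))
    return adj


def max_clique(adj):
    """Return one maximum clique using Bron-Kerbosch with pivoting."""
    best = []

    def expand(r, p, x):
        nonlocal best
        if not p and not x:
            # r is a maximal clique; keep it if it is the largest so far.
            if len(r) > len(best):
                best = list(r)
            return
        # Pivot on the vertex with most neighbours among the candidates.
        pivot = max(p | x, key=lambda v: len(adj[v] & p))
        for v in list(p - adj[pivot]):
            expand(r + [v], p & adj[v], x & adj[v])
            p.remove(v)
            x.add(v)

    expand([], set(adj), set())
    return best
```

The clique returned by `max_clique(compatibility_graph(A, B))` lists matched row pairs; the rows it selects from A and B index submatrices whose entries agree within the tolerance. Exhaustive clique search is exponential in the worst case, so this sketch is only suitable for small matrices.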

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

GRISOTTO: A greedy approach to improve combinatorial algorithms for motif discovery with prior knowledge

Author: A Valouev
Alexandra M Carvalho
AM Carvalho
AP Fejes
Arlindo L Oliveira
C Deremble
C Lee
CT Harbison
D Ucar
E Segal
E Valen
F Daenen
G Paillard
G Paillard
G Pavesi
GC Yuan
I Lafontaine
I Lafontaine
I Lafontaine
IV Kulakovskiy
JV Ponomarenko
KD MacIsaac
L Marsan
L Narlikar
L Narlikar
M Hu
M Kellis
MF Sagot
N Pisanti
R Gordân
R Gordân
R Gordân
R Pudimat
R Siddharthan
RA O'Flanagan
RG Beiko
S Sinha
T Wang
TL Bailey
TL Bailey
V Matys
WW Wasserman
X Chen
Y Liu
Publication venue: BioMed Central
Publication date: 01/01/2011

Background: Position-specific priors (PSPs) have been used with success to boost EM and Gibbs sampler-based motif discovery algorithms. PSP information has been computed from different sources, including orthologous conservation, DNA duplex stability, and nucleosome positioning. Prior information has not yet been used in the context of combinatorial algorithms. Moreover, priors have been used only independently, and the gain of combining priors from different sources has not yet been studied. Results: We extend RISOTTO, a combinatorial algorithm for motif discovery, by post-processing its output with a greedy procedure that uses prior information. PSPs from different sources are combined into a scoring criterion that guides the greedy search procedure. The resulting method, called GRISOTTO, was evaluated over 156 yeast TF ChIP-chip sequence-sets commonly used to benchmark prior-based motif discovery algorithms. Results show that GRISOTTO is at least as accurate as twelve other state-of-the-art approaches for the same task, even without combining priors. Furthermore, by considering combined priors, GRISOTTO is considerably more accurate than the state-of-the-art approaches for the same task. We also show that PSPs improve GRISOTTO's ability to retrieve motifs from mouse ChIP-seq data, indicating that the proposed algorithm can be applied to data from a different technology and for a higher eukaryote. Conclusions: The conclusions of this work are twofold. First, post-processing the output of combinatorial algorithms by incorporating prior information leads to a very efficient and effective motif discovery method. Second, combining priors from different sources is even more beneficial than considering them separately.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Phylogenomic Analysis of Marine Roseobacters

Author: A Buchan
A Stamatakis
C Dutta
Carl Kingsford
Cathy H. Wu
CH Wu
CJ Creevey
CM Thomas
D Posada
DF Robinso
E Bapteste
E Lerat
E Susko
F Abascal
G Bouxin
G Talavera
GT Taylor
H Ochman
H Shimodaira
H Shimodaira
HA Schmidt
Hongzhan Huang
I Wagner-Dobler
I Wagner-Dobler
J Bergsten
J Castresana
J Felsenstein
JA Eisen
JD Thompson
JP Gogarten
JP Gogarten
JP Huelsenbeck
JR Brown
Kai Tang
KH Tang
L Li
LM Schouls
MA Moran
MS Poptsova
N Galtier
Nianzhi Jiao
NZ Jiao
O Zhaxybayeva
O Zhaxybayeva
R Jain
R Seshadri
RD Page
RG Beiko
RG Beiko
RL Charlebois
RL Tatusov
RS Poretsky
S Guindon
SF Altschul
SJ Sorensen
SM Sowell
T Brinkhoff
T Shi
TR Miller
V Daubin
VM Markowitz
Y Zhang
Y Zhao
ZS Kolber
Publication venue: Public Library of Science
Publication date: 01/01/2010

Background: Members of the Roseobacter clade, which play a key role in the biogeochemical cycles of the ocean, are diverse and abundant, comprising 10–25% of the bacterioplankton in most marine surface waters. The rapid accumulation of whole-genome sequence data for the Roseobacter clade allows us to obtain a clearer picture of its evolution. Methodology/Principal Findings: In this study, about 1,200 likely orthologous protein families were identified from 17 Roseobacter genomes. Functional annotations for these genes are provided by iProClass. Phylogenetic trees were constructed for each gene using maximum likelihood (ML) and neighbor joining (NJ). Putative organismal phylogenetic trees were built with phylogenomic methods. These trees were compared and analyzed using principal coordinates analysis (PCoA) and the approximately unbiased (AU) and Shimodaira–Hasegawa (SH) tests. A core set of 694 genes with a vertical descent signal that are resistant to horizontal gene transfer (HGT) is used to reconstruct a robust organismal phylogeny. In addition, we identified the 109 genes most likely to have undergone HGT. The core set contains genes that encode ribosomal apparatus, ABC transporters and chaperones often found in environmental metagenomic and metatranscriptomic data. The genes in the core set are spread uniformly among the various functional classes and biological processes. Conclusions/Significance: Here we report a new multigene-derived phylogenetic tree of the Roseobacter clade. Of particular interest is the HGT of eleven genes involved in vitamin B12 synthesis as well as key enzymes fo

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Xiamen University Institutional Repository

Reproducing the manual annotation of multiple sequence alignments using a SVM classifier

Author: Allan Lavell
Altschul
Andrew J. Roger
Beiko
Bradley
Castresana
Chang
Christian Blouin
Do
Dutheil
Eddy
Edgar
Edward Susko
Fawcett
Feng
Finn
Hall
Holmes
Jones
Landan
Landan
Lassmann
Lassmann
Lunter
Löytynoja
Needleman
Notredame
Notredame
Nuin
Ogdenw
Pei
R Development Core Team
Roettger
Saitou
Scott Perry
Shan
Sing
Smith
Thompson
Thompson
Thompson
Van Walle
Wong
Publication venue: Oxford University Press

Motivation: Aligning protein sequences with the best possible accuracy requires sophisticated algorithms. Since the optimal alignment is not guaranteed to be the correct one, it is expected that even the best alignment will contain sites that do not respect the assumption of positional homology. Because formulating rules to identify these sites is difficult, it is common practice to manually remove them. Although considered necessary in some cases, manual editing is time consuming and not reproducible. We present here an automated editing method based on the classification of ‘valid’ and ‘invalid’ sites

Crossref

PubMed Central

FindAScienceBerth: Connecting Underrepresented Groups in Marine Science with Available Berths on Scientific Research Vessels

Author: Anderson Madeleine
Darlington Eleanor
Fielding Sofie
Fisher Ben
Hendry Katharine R
Joshi Siddhi
Maas Beiko
Marzocchi Alice
McGregor Anna
Sierdzan Katie
Van Ladeghem Katrien J. J.

No abstract available

Enlighten

Phylogenetic Detection of Recombination with a Bayesian Prior on the Distance between Trees

Author: A Gelman
AC Siepel
AJ Drummond
BL Allen
C Wiuf
CX Chan
D Bryant
D Husmeier
D Husmeier
D Husmeier
D MacLeod
D Posada
D Posada
DF Robinson
DL Swofford
F Al-Awadhi
F Fang
F Ge
F Ronquist
G Altekar
G Hickey
G McVean
GB Golding
GF Weiller
H Kishino
Hirohisa Kishino
I DiMatteo
J Felsenstein
J Felsenstein
J Hein
J Hein
J Hey
JL Thorne
JP Huelsenbeck
L Nakhleh
L Nakhleh
Leonardo de Oliveira Martins
M Hasegawa
M Sierra
M Steel
MA Suchard
MA Suchard
Mark Isalan
MHS Jotun Hein
MK Kuhner
ML Rajaram
MO Salminen
MT Hallett
NC Grassly
P Awadalla
P Fearnhead
P Lefeuvre
R Nielsen
RC Griffiths
RG Beiko
RG Beiko
RR Hudson
RR Hudson
VN Minin
VN Minin
W Hordijk
YS Song
YS Song
YS Song
Z Yang
Élcio Leal
Publication venue: Public Library of Science
Publication date: 07/06/2008

Genomic regions participating in recombination events may support distinct topologies, and phylogenetic analyses should incorporate this heterogeneity. Existing phylogenetic methods for recombination detection are challenged by the enormous number of possible topologies, even for a moderate number of taxa. If, however, the detection analysis is conducted independently between each putative recombinant sequence and a set of reference parentals, potential recombinations between the recombinants are neglected. In this context, a recombination hotspot can be inferred in phylogenetic analyses if we observe several consecutive breakpoints. We developed a distance measure between unrooted topologies that closely resembles the number of recombinations. By introducing a prior distribution on these recombination distances, a Bayesian hierarchical model was devised to detect phylogenetic inconsistencies occurring due to recombinations. This model relaxes the assumption of known parental sequences, still common in HIV analysis, allowing the entire dataset to be analyzed at once. On simulated datasets with up to 16 taxa, our method correctly detected recombination breakpoints and the number of recombination events for each breakpoint. The procedure is robust to rate and transition∶transversion heterogeneities for simulations with and without recombination. This recombination distance is related to recombination hotspots. Applying this procedure to a genomic HIV-1 dataset, we found evidence for hotspots and de novo recombination

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Crossref

Repositório Institucional UNIFESP

Directory of Open Access Journals

PubMed Central

Spiral - Imperial College Digital Repository

Intact learning and memory in rats following treatment with the dual orexin receptor antagonist almorexant

Author: AV Telegdy
AV Terry Jr
B Rasch
BE Jones
BH Rasch
BJ Everitt
C Brisbare-Roch
C Peyron
C Smith
CB Saper
DA Hamilton
E Akbari
E Akbari
E Akbari
E Linstow Roloff von
EK Perry
François Jenck
Hendrik Dietrich
HR Smith
IQ Whishaw
J Beiko
J Born
K Ohno
L Genzel
L Lecea de
LB Jaeger
MD Lindner
MG Lee
MJ Wayner
MP Walker
P Maquet
P Riekkinen Jr
P Riekkinen Jr
R Stickgold
R Stickgold
RGM Morris
RGM Morris
RK McNamara
RT Bartus
S Aou
S Gais
TS Kilduff
V Sterpenich
W Plihal
Publication venue: Springer-Verlag
Publication date: 01/01/2010

Crossref

Springer - Publisher Connector

PubMed Central

Reductive Evolution of the Mitochondrial Processing Peptidases of the Unicellular Parasites Trichomonas vaginalis and Giardia intestinalis

Author: A Mukhopadhyay
A Sali
AB Taylor
Anna Matušková
CB Do
Daniel Eliot Goldberg
Eva Kutějová
F Abascal
G Talavera
HG Morrison
Ivan Hrdý
J Janata
J Tovar
Jan Tachezy
Jiří Janata
JM Carlton
JN Timmis
JP Bollback
K Bryson
K Kojima
Lenka Horváthová
M Arretz
Marián Novotný
MP Yaffe
MT Brown
NA Baker
O Gakh
Ondřej Šmíd
P Doležal
P Doležal
PG Foster
PJ Bradley
PO Lewis
R Rodriguez
RA Laskowski
RA Laskowski
RC Edgar
RG Beiko
Robert P. Hirt
S Kitada
S Kitada
SD Dyall
Simon R. Harris
T. Martin Embley
TM Embley
Tomáš Kučera
W Neupert
Y Nagao
Publication venue: Public Library of Science
Publication date: 01/12/2008

Mitochondrial processing peptidases are heterodimeric enzymes (α/βMPP) that play an essential role in mitochondrial biogenesis by recognizing and cleaving the targeting presequences of nuclear-encoded mitochondrial proteins. The two subunits are paralogues that probably evolved by duplication of a gene for a monomeric metallopeptidase from the endosymbiotic ancestor of mitochondria. Here, we characterize the MPP-like proteins from two important human parasites that contain highly reduced versions of mitochondria, the mitosomes of Giardia intestinalis and the hydrogenosomes of Trichomonas vaginalis. Our biochemical characterization of recombinant proteins showed that, contrary to a recent report, the Trichomonas processing peptidase functions efficiently as an α/β heterodimer. By contrast, and so far uniquely among eukaryotes, the Giardia processing peptidase functions as a monomer comprising a single βMPP-like catalytic subunit. The structure and surface charge distribution of the Giardia processing peptidase predicted from a 3-D protein model appear to have co-evolved with the properties of Giardia mitosomal targeting sequences, which, unlike classic mitochondrial targeting signals, are typically short and impoverished in positively charged residues. The majority of hydrogenosomal presequences resemble those of mitosomes, but longer, positively charged mitochondrial-type presequences were also identified, consistent with the retention of the Trichomonas αMPP-like subunit. Our computational and experimental/functional analyses reveal that the divergent processing peptidases of Giardia mitosomes and Trichomonas hydrogenosomes evolved from the same ancestral heterodimeric α/βMPP metallopeptidase as did the classic mitochondrial enzyme. The unique monomeric structure of the Giardia enzyme, and the co-evolving properties of the Giardia enzyme and substrate, provide a compelling example of the power of reductive evolution to shape parasite biology

Crossref

Directory of Open Access Journals

PubMed Central